A Workbench for Acquiring Semantic Information and Constructing Dictionary for Compound Noun Analysis
Abstract
This paper describes a workbench system for constructing a dictionary to interpret compound nouns, integrating the acquisition of semantic information with the interpretation of compound nouns. First, we extract semantic information from a machine-readable dictionary (MRD) and corpora using regular expressions. Then, compound nouns are interpreted based on semantic relations, automatically extracted semantic features, and subcategorization information, according to whether the head noun is attributive or predicative. Experimental results show that our method, which uses hybrid knowledge selected according to the characteristics of the head noun, improves the accuracy rate by 40.30% and the coverage rate by 12.73% over previous research that uses semantic relations extracted from MRDs. Because compound nouns are highly productive and their interpretation requires hybrid knowledge, we propose a workbench for compound noun interpretation in which the necessary knowledge, such as semantic patterns, semantic relations, and interpretation instances, can be extended, rather than assuming predefined lexical knowledge.
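The regular-expression extraction step mentioned in the abstract could look roughly like the following minimal sketch. The abstract does not give the actual patterns, relation labels, or dictionary format, so `PATTERNS`, `extract_relations`, the relation names, and the English toy definitions are all hypothetical illustrations rather than the authors' method.

```python
import re

# Hypothetical definition patterns; the abstract does not list the real ones.
PATTERNS = [
    ("IS-A", re.compile(r"^an? (?:kind|type) of (?P<target>\w+)")),
    ("PART-OF", re.compile(r"^an? part of (?:an? |the )?(?P<target>\w+)")),
    ("MADE-OF", re.compile(r"made (?:of|from) (?P<target>\w+)")),
]


def extract_relations(headword, definition):
    """Return (headword, relation, target) triples matched in one definition."""
    text = definition.lower().strip()
    triples = []
    for relation, pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            triples.append((headword, relation, match.group("target")))
    return triples


# Toy entries standing in for a machine-readable dictionary (assumed format).
toy_mrd = {
    "sparrow": "A kind of bird",
    "wheel": "A part of a vehicle",
    "bottle": "A container made of glass",
}

for word, definition in toy_mrd.items():
    print(extract_relations(word, definition))
```

Keeping the patterns in a plain list mirrors the abstract's point that the knowledge should be extensible: in this sketch, a new semantic pattern is added by appending one entry, without touching the extraction function.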
Similar resources
A Supervised Method for Constructing Sentiment Lexicon in Persian Language
Due to the rapid growth of digital content on the internet and social media, sentiment analysis is one of the emerging fields. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of sentiment lexicon as a valuable language resource is one of the imp...
An Analysis of Persian Compound Nouns as Constructions
In Construction Morphology (CM), a compound is treated as a construction at the word level with a systematic correlation between its form and meaning, in the sense that any change in the form is accompanied by a change in the meaning. Compound words are coined by compounding templates which are called abstract schemas in CM. These abstract constructional schemas generalize over sets of existing...
Linguistically Light Lexical Extensions for Ontologies
An increasing number of enterprises are beginning to include semantic web ontologies into their Information Extraction (IE) and Text Analytics (TA) applications. This can be challenging for a TA group wishing to avail of semantic web ontologies due to the manual effort of retargeting and tailoring language resources within the TA system to a new domain to meet customer needs. A lightweight lexi...
Construction of Japanese Nominal Semantic Dictionary using “A NO B” Phrases in Corpora
This paper describes a method of constructing a Japanese nominal semantic dictionary, which is indispensable for text analysis, especially for indirect anaphora resolution. The main idea is to use noun phrases of the form “A NO (postposition) B” in corpora. The two nouns A and B in “A NO B” can have several semantic relations. By collecting “A NO B” phrases from corpora, analyzing their semantic relations, and...
Segmentation of Compound Nouns using Composite Mutual Information
In Korean, a compound noun may be freely formed with or without spaces between simple nouns. The flexible word formation rule of Korean raises a serious problem in processing compound nouns with computers, in particular, in searching a dictionary with the compound noun as a search key. This paper describes a corpus-based method for segmenting a compound noun into simple nouns. Segmentation is per...